Skip to content

fix: use timezone-aware datetimes in cache _time_since_modification#2130

Open
Acuspeedster wants to merge 8 commits intooscal-compass:developfrom
Acuspeedster:fix/use-timezone-aware-datetime-in-cache
Open

fix: use timezone-aware datetimes in cache _time_since_modification#2130
Acuspeedster wants to merge 8 commits intooscal-compass:developfrom
Acuspeedster:fix/use-timezone-aware-datetime-in-cache

Conversation

@Acuspeedster
Copy link
Contributor

Use timezone-aware datetime objects in FetcherBase._time_since_modification in cache.py to prevent incorrect cache expiry calculations in DST-observing environments.

Changes:

  • trestle/core/remote/cache.py: _time_since_modification replace datetime.datetime.fromtimestamp(mtime) and datetime.datetime.now() (both naive) with datetime.datetime.fromtimestamp(mtime, tz=datetime.timezone.utc) and datetime.datetime.now(datetime.timezone.utc) (both UTC-aware)

Types of changes

  • Hot fix (emergency fix and release)
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Documentation (change which affects the documentation site)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Release (develop -> main)

Quality assurance (all should be covered).

  • My code follows the code style of this project.
  • Documentation for my change is up to date?
  • My PR meets testing requirements.
  • All new and existing tests passed.
  • All commits are signed-off.

Summary

FetcherBase._time_since_modification calculates how long ago a cached file was last modified, and _is_stale uses this value to decide whether to re-fetch a remote OSCAL resource.

The bug: datetime.datetime.fromtimestamp(mtime) and datetime.datetime.now() both return naive datetime objects (no tzinfo). On systems in DST-observing timezones, the OS filesystem timestamp is stored in UTC, but fromtimestamp converts it to local wall-clock time. When a DST transition occurs between the time the file was written and the time _is_stale is called, the two naive datetimes can be in different UTC offsets. The subtraction then silently produces a timedelta that is off by exactly one hour causing a cache hit to be treated as stale (or a stale cache to be treated as fresh) for the duration around the DST boundary.

The fix: pass tz=datetime.timezone.utc to both calls so that both operands are timezone-aware and refer to the same UTC reference, making the subtraction always correct regardless of local timezone or DST transitions.

This is the same class of datetime handling issue that was fixed in model_age by #2110.

Key links:

Before you merge

  • Ensure it is a 'squash commit' if not a release.
  • Ensure CI is currently passing
  • Check sonar. If you are working for a fork a maintainer will reach out, if required.

Signed-off-by: Acuspeedster <arnavrajsingh@gmail.com>
@Acuspeedster Acuspeedster requested a review from a team as a code owner March 10, 2026 09:34
@Acuspeedster
Copy link
Contributor Author

Acuspeedster commented Mar 11, 2026

@degenaro @vikas-agarwal76
This requires a review to be merged.

@Acuspeedster
Copy link
Contributor Author

@degenaro I think you have already gone through this code change... this pr just needs to be merged.

@Acuspeedster
Copy link
Contributor Author

@degenaro @vikas-agarwal76 reminder
Thanks

Copy link
Collaborator

@degenaro degenaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is attempting to standardize time to utc.
Questions:

  • what callers are affected?
  • is similar change needed elsewhere in the code base for consistency?
  • will this have any backward compatibility issues?
  • will this change have any user facing impact?

@Acuspeedster
Copy link
Contributor Author

Hey @degenaro
Great questions, thanks for reviewing.

  1. What callers are affected?
  • This affects paths that rely on FetcherBase._is_stale() via _time_since_modification() in trestle/core/remote/cache.py.
  • Concretely, that means remote fetch workflows (https:// and sftp://) using FetcherFactory:
    • trestle/core/commands/import_.py (fetcher.get_oscal(True))
    • profile import resolution in trestle/core/resolver/_import.py (fetcher.get_oscal())
    • repository import path in trestle/core/repository.py (load_and_import_model)
    • SSP inheritance fetch in trestle/core/crm/ssp_inheritance_api.py
  • LocalFetcher is not affected by this staleness math because it always returns stale (_is_stale -> True) and bypasses age subtraction logic.
  1. Is similar change needed elsewhere for consistency?
  • I did a codebase scan for fromtimestamp(...) / naive now() usage.
  • For this specific bug class (filesystem mtime age calc + DST-sensitive subtraction), cache.py is the production path that needed this fix.
  • Similar timezone-hardening was already done for model age (ModelUtils.model_age, PR fix: use total_seconds() in model_age to handle multi-day deltas #2110 class of fix).
  • There are still some naive datetime.now() calls elsewhere, but those are mostly elapsed-time/logging use (e.g., transform timing), not timezone-reference comparisons tied to cache expiry.
  1. Backward compatibility impact?
  • No API/CLI/schema contract change.
  • Return types and behavior contracts are unchanged.
  • The only behavior change is correctness around DST boundaries (removing potential +/-1 hour skew in staleness calculation).
  1. User-facing impact?
  • Yes, but only as a correctness improvement:
    • fewer unnecessary refetches when cache is actually fresh
    • avoids incorrectly treating stale cache as fresh around DST transitions
  • No new flags/options and no change to model output format.

Net: this is a low-risk bug fix that standardizes cache age calculations to UTC-aware datetimes, consistent with prior timezone-hardening work.

@degenaro
Copy link
Collaborator

@Acuspeedster Thx.

Related to testing, I see for coverage:

While we are here, can we improve the coverage?

Signed-off-by: Acuspeedster <arnavrajsingh@gmail.com>
Signed-off-by: Acuspeedster <arnavrajsingh@gmail.com>
@Acuspeedster
Copy link
Contributor Author

Hey @degenaro
Thanks for the suggestion on coverage.

I added a focused unit test for the timezone fix:

  • tests/trestle/core/remote/cache_test.py::test_time_since_modification_uses_utc

This test explicitly validates that _time_since_modification() uses UTC-aware datetime calls (fromtimestamp(..., tz=UTC) and now(UTC)) and checks the resulting timedelta.

Test run:
python -m pytest tests/trestle/core/remote/cache_test.py -k "time_since_modification_uses_utc"
Result: 1 passed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants